European Heart Journal - Digital Health — Latest Matching Preprints

1

Beyond Doppler: Scalable AI Detection of LVOT Obstruction in HCM

Crystal, O.; Farina, J. M. M.; Scalia, I. G.; Ayoub, C.; Park, H.-B.; Kim, K. A.; Arsanjani, R.; Lester, S. J.; Banerjee, I.

2026-04-20 cardiovascular medicine 10.64898/2026.04.17.26351151 medRxiv

Top 0.1%

25.9%

Background: Accurate assessment of left ventricular outflow tract (LVOT) gradients is critical for hypertrophic cardiomyopathy (HCM) management, yet Doppler-based measurements are technically demanding and require expertise. Objective: To develop a multi-view deep learning model capable of classifying LVOT obstruction (> 20 mmHg) using routine 2D echocardiographic windows without reliance on Doppler imaging. Methods: We trained and externally validated a cross-attention-based video-to-video fusion framework that integrated EchoPrime-derived video representations from three standard transthoracic echocardiographic views to classify LVOT gradients. Results: Training was performed on a derivation cohort (N = 1833) from a tertiary care system in the United States, with model performance evaluated on an internal held-out test set (N = 275) and a Korean external validation cohort (N = 46). Single-view baselines showed limited discrimination (external AUROCs 0.47-0.70). In contrast, the domain-specific foundation model (EchoPrime) achieved superior single-view performance (AUROCs 0.75-0.80 internal; 0.79-0.83 external), highlighting the importance of echo-specific pretraining and temporal modeling. The proposed multi-view fusion further enhanced predictive performance, with the late fusion model reaching an AUROC of 0.84 on the external cohort with significant population shift. Conclusions: These results suggest LVOT physiology is encoded in routine 2D imaging and can be leveraged for clinically relevant gradient classification without Doppler input. The proposed AI-guided strategy demonstrates substantial cost savings compared with the screen-all approach. By integrating complementary spatial-temporal information across multiple views, our approach generalizes robustly across populations and may enable real-time decision support, extend LVOT assessment to portable or resource-limited settings, and complement Doppler-based evaluation for longitudinal HCM management.

2

Comparison of the Expert Guidelines With Artificial Intelligence-Driven Echocardiographic Assessment of Diastolic Function

Tokodi, M.; Kagiyama, N.; Pandey, A.; Nakamura, Y.; Akama, Y.; Takamatsu, S.; Toki, M.; Kitai, T.; Okada, T.; Lam, C. S.; Yanamala, N.; Sengupta, P.

2026-04-24 cardiovascular medicine 10.64898/2026.04.23.26350072 medRxiv

Top 0.1%

10.0%

Background: Accurate assessment of diastolic function and left ventricular (LV) filling pressure is central to heart failure diagnosis and risk stratification. Contemporary guideline algorithms rely on complex parameters that are not consistently available in routine clinical practice. Objective: To compare the diagnostic and prognostic performance of the 2016 American Society of Echocardiography/European Association of Cardiovascular Imaging (ASE/EACVI) and 2025 ASE guidelines with a deep learning model based on routinely acquired echocardiographic variables. Methods: This study evaluated the guideline-based algorithms and a deep learning model in participants from the Atherosclerosis Risk in Communities (ARIC) cohort (n=5450) for prognostication and two invasive hemodynamic validation cohorts from the United States (n=83) and Japan (n=130) for detection of elevated left ventricular filling pressure. Results: In the ARIC cohort, the deep learning model demonstrated superior prognostic performance compared with the 2016 and 2025 guidelines (C-index: 0.676 vs. 0.638 and 0.602, respectively; both p<0.001). Similar findings were observed among participants with preserved ejection fraction (C-index: 0.660 vs. 0.628 and 0.590; both p<0.001), with improved performance compared with the H2FPEF score (C-index: 0.660 vs. 0.607; p<0.001). In the US hemodynamic validation cohort, the deep learning model showed higher diagnostic performance than the 2025 guidelines (AUC: 0.879 vs. 0.822; p=0.041) and similar performance compared with the 2016 guidelines (AUC: 0.879 vs. 0.812; p=0.138). In the Japanese hemodynamic validation cohort, the deep learning model outperformed both guidelines (AUC: 0.816 vs. 0.634 and 0.694; both p<0.05).
Conclusions: A deep learning model leveraging routinely available echocardiographic parameters demonstrated improved diagnostic and prognostic performance compared with contemporary guideline-based approaches, potentially offering a scalable alternative for assessing diastolic function and left ventricular filling pressures.

3

DIVAID: Consistent division of atrial geometries from multimodal imaging according to the EHRA/EACVI 15-segment bi-atrial model

Goetz, C.; Eichenlaub, M.; Schmidt, K.; Wiedmann, F.; Invers Rubio, E.; Martinez Diaz, P.; Luik, A.; Althoff, T.; Schmidt, C.; Loewe, A.

2026-04-23 cardiovascular medicine 10.64898/2026.04.22.26351448 medRxiv

Top 0.1%

9.8%

The recently published EHRA/EACVI consensus statement on a standardized bi-atrial regionalization provides new opportunities for consistent regional analyses across patients, imaging modalities and clinical centers. To make this standardized regionalization widely accessible, we developed the open-source software DIVAID, which automatically divides bi-atrial geometries according to the proposed regions, ensuring consistency, reproducibility and operator independence. We evaluated the accuracy of the algorithm by comparing its results to manual expert annotations across 140 geometries from multiple modalities and centers. Veins were automatically clipped correctly in 81% and orifices annotated correctly in 100% of cases. The median (interquartile range; IQR) Dice similarity coefficient (DSC) for left atrial regions was 0.98 (0.96-1.00) for DIVAID-expert and 0.98 (0.94-1.00) for inter-expert comparisons. For right atrial geometries, DSC was higher for DIVAID-expert than for inter-expert comparisons at 0.90 (0.80-0.95) and 0.88 (0.74-0.94), respectively. To assess the accuracy of regional boundaries, we computed the mean average surface distance (MASD) for boundaries derived from automatic or manual annotations. The median (IQR) MASD between DIVAID and experts was 0.17 mm (0.03-0.78) and 1.93 mm (0.65-3.96) in the left and right atrium, respectively. To conclude, DIVAID robustly divides anatomically diverse bi-atrial geometries according to the 15-segment model, while outperforming cardiac experts in both speed and consistency, and demonstrating an accuracy of regional boundaries comparable to the spatial resolution of cardiac imaging modalities. By providing automated, consistent atrial regionalization, DIVAID enables large-scale, standardized regional analyses and data-driven investigation of harmonized, multi-dimensional datasets, which may advance atrial arrhythmia research and personalized treatment strategies.
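
For readers unfamiliar with the overlap metric reported above, the Dice similarity coefficient between two region labelings can be sketched as follows. This is a minimal illustration and not DIVAID's implementation: the function name `dice_coefficient`, the per-vertex label lists, and the convention for empty regions are all assumptions made for this example.

```python
from typing import Sequence


def dice_coefficient(labels_a: Sequence[int], labels_b: Sequence[int], region: int) -> float:
    """Dice similarity coefficient for one region label: 2|A∩B| / (|A| + |B|)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same elements")
    in_a = [x == region for x in labels_a]
    in_b = [x == region for x in labels_b]
    overlap = sum(a and b for a, b in zip(in_a, in_b))
    total = sum(in_a) + sum(in_b)
    # Assumed convention: a region absent from both labelings counts as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical per-vertex region labels from an automatic and a manual annotation.
automatic = [1, 1, 1, 2, 2, 3, 3, 3]
manual = [1, 1, 2, 2, 2, 3, 3, 3]

for region in (1, 2, 3):
    print(region, dice_coefficient(automatic, manual, region))
```

A value of 1.0 means the two annotations assign exactly the same elements to a region; disagreement at a shared boundary lowers the score for both neighbouring regions.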

4

Multimodal Integration of Ambulatory ECG and Clinical Features for Sudden Cardiac Death and Pump Failure Death Prediction

Swee, S.; Adam, I.; Zheng, E. Y.; Ji, E.; Wang, D.; Speier, W.; Hsu, J.; Chang, K.-W.; Shivkumar, K.; Ping, P.

2026-04-22 cardiovascular medicine 10.64898/2026.04.21.26351421 medRxiv

Top 0.1%

6.4%

Ambulatory electrocardiograms (ECGs) provide continuous monitoring of the heart's electrical activity. However, many existing machine learning and artificial intelligence models for analyzing ambulatory ECG traces are unimodal and do not incorporate patient clinical context. In this study, we propose a multimodal framework integrating ambulatory ECG-derived representations with clinical text embeddings to predict two cardiac outcomes: sudden cardiac death and pump failure death. Ambulatory ECG traces are preprocessed, segmented, and encoded via a multiple instance learning and temporal convolutional neural network framework. In parallel, patient clinical features are parsed into structured prompts, which are passed through a large language model to generate clinical reasoning; this reasoning passes through a biomedical language encoder to generate a text embedding. With the ECG and text embeddings, we systematically evaluate multiple fusion strategies, including concatenation- and gating-based approaches, to integrate these two data modalities. Our results demonstrate that multimodal models consistently outperform unimodal baselines, with adaptive fusion mechanisms providing the greatest improvements in predictive performance. Decision curve analysis highlights the potential clinical utility of the proposed framework for risk stratification. Finally, we visualize model attention across modalities, including ECG attention patterns, segment-level saliency, heart rate variability features, and clinical reasoning, to contextualize patient-specific predictions.

5

Vision Language Model for Coronary Angiogram Analysis and Report Generation: Development and Evaluation Study

Jiang, Q.; Ke, Y.; Sinisterra, L. G.; Elangovan, K.; Li, Z.; Yeo, K. K.; Jonathan, Y.; Ting, D. S. W.

2026-04-21 cardiovascular medicine 10.64898/2026.04.19.26351241 medRxiv

Top 0.1%

6.4%

Coronary artery disease is a leading cause of morbidity and mortality. Invasive coronary angiography is currently the gold standard in disease diagnosis. Several studies have attempted to use artificial intelligence (AI) to automate its interpretation with varying levels of success. However, most existing studies cannot generate detailed angiographic reports beyond simple classification or segmentation. This study aims to fine-tune and evaluate the performance of a Vision-Language Model (VLM) in coronary angiogram interpretation and report generation. Using 20,000 angiogram keyframes from 1,987 patients collated across four unique datasets, we fine-tuned the InternVL2-4B model with low-rank adapter weights to perform stenosis detection, anatomy labelling, and report generation. The fine-tuned VLM achieved a precision of 0.56, recall of 0.64, and F1-score of 0.60 for stenosis detection. In anatomy segmentation, it attained a weighted precision of 0.50, recall of 0.43, and F1-score of 0.46, with higher scores in major vessel segments. Report generation integrating multiple angiographic projection views yielded an accuracy of 0.42, negative predictive value of 0.58 and specificity of 0.52. This study demonstrates the potential of using VLM to streamline angiogram interpretation to rapidly provide actionable information to guide management, support care in resource-limited settings, and audit the appropriateness of coronary interventions. Author summary: Coronary artery disease has a heavy disease burden worldwide, and coronary angiography is the gold standard imaging for its diagnosis. Interpreting these complex images and producing clinical reports requires significant expertise and time. In this study, we fine-tuned and investigated an open-source VLM, InternVL2-4B, to interpret and report coronary angiogram images in key tasks including stenosis detection, anatomy identification, and full report generation.
We also benchmarked the fine-tuned InternVL2-4B against a state-of-the-art segmentation model, YOLOv8x, which was evaluated on the same test sets. We examined how machine learning metrics like the intersection over union score may not fully capture the clinical accuracy of model predictions and discussed the limitations of relying solely on these metrics for evaluating clinical AI systems. Although the model has not yet achieved expert-level interpretation, our results demonstrate the potential and feasibility of automating the reporting of coronary angiograms. Such systems could potentially assist cardiologists by improving reporting efficiency, highlighting lesions that may require review, and enabling automated calculations of clinical scores such as the SYNTAX score.

6

Persistent Atrial Myopathy Despite Ventricular Recovery: Prognostic Significance of Discordant LV-LA Strain Patterns in HFrEF

Park, J.; Hwang, I.-C.; Kim, H.-K.; Bae, N. Y.; Lim, J.; Kwak, S.; Bak, M.; Choi, H.-M.; Park, J.-B.; Yoon, Y. E.; Lee, S. P.; Kim, Y.-J.; Cho, G.-Y.

2026-04-23 cardiovascular medicine 10.64898/2026.04.22.26351480 medRxiv

Top 0.2%

3.9%

Aims: Assessment of treatment response in HFrEF has largely relied on left ventricular (LV)-centric parameters, yet the left atrium (LA) plays a central role in modulating LV filling and reflects the cumulative hemodynamic burden. Whether discordant recovery between LV and LA function carries distinct prognostic implications in patients treated with ARNI-based therapy remains unknown. Methods and results: From the multicenter STRATS-HF-ARNI registry, 1,182 patients with HFrEF who underwent serial echocardiography at baseline and one-year follow-up were included. Patients were classified into four strain recovery phenotypes according to the direction of change in LVGLS and LASr at one year: Group A, concordant recovery (57.4%); Group B, discordant atrial non-recovery (11.2%); Group C, discordant ventricular non-recovery (15.6%); and Group D, concordant non-recovery (16.0%). Clinical outcomes included all-cause mortality, cardiovascular mortality, and HF hospitalization. Despite achieving LV functional improvement, Group B exhibited persistent LASr deterioration, accompanied by less favorable hemodynamic trajectories compared with Group A. On multivariable Cox regression, Group B was associated with significantly higher risks of all-cause mortality (adjusted hazard ratio [aHR] 3.53, 95% confidence interval [CI] 1.60-7.79) and cardiovascular mortality (aHR 5.68, 95% CI 1.91-16.92), comparable to Group D. Group C demonstrated higher HF hospitalization risk (aHR 2.25, 95% CI 1.31-3.86). The adverse prognostic impact of discordant atrial non-recovery was consistently observed across subgroups stratified by baseline LVGLS and LASr levels. Conclusion: In HFrEF patients treated with ARNI-based therapy, persistent LA dysfunction despite LV functional improvement identifies a high-risk phenotype comparable to concordant non-recovery. These findings suggest that concurrent assessment of LV and LA strain may provide incremental prognostic value beyond LV-centric metrics alone.

7

Drug-Target Mendelian Randomization and Imaging Mediation Analyses Reveal Therapeutic Targets and Causal Mechanisms for Cardiomyopathies

Wang, P.; Song, Y.; Zhang, B.; Yang, J.

2026-04-22 cardiovascular medicine 10.64898/2026.04.20.26351344 medRxiv

Top 0.2%

3.5%

Background: Hypertrophic (HCM) and dilated (DCM) cardiomyopathy constitute the principal phenotypes of primary cardiomyopathy, yet both lack sufficient therapeutic options. Integrating genetic insights with detailed cardiac phenotyping offers a promising strategy to prioritize targets and elucidate their mechanisms of action. Methods: We conducted a three-stage analysis. First, drug-target Mendelian randomization (MR) was performed using cis-acting protein (pQTL) and expression (eQTL) quantitative trait loci as genetic instruments for potential drug targets. Second, we examined causal associations between 82 cardiac magnetic resonance (CMR)-derived imaging traits and HCM/DCM risk in a CMR-based MR analysis. Third, mediation MR was employed to quantify the proportion of the genetic effect of prioritized drug targets on cardiomyopathy risk that was mediated through specific CMR phenotypes. Results: Our analyses identified 19 and 13 potential therapeutic targets for HCM and DCM, respectively. CMR-based MR revealed that HCM risk was causally associated with increased right ventricular ejection fraction (RVEF) and greater left ventricular wall thickness, whereas DCM risk was linked to ventricular dilation, impaired myocardial strain, and altered aortic dimensions. Critically, mediation analysis established that these CMR traits served as significant intermediate pathways. The protective effect of ALPK3 on HCM risk was mediated through a reduction in myocardial wall thickness. Conversely, the effects of PDLIM5, HSPA4, and FBXO32 on DCM risk were exerted in part via alterations in aortic dimensions. Conclusion: This integrative genetic and imaging study systematically identifies candidate therapeutic targets for HCM and DCM and delineates the specific CMR phenotypes through which they likely exert their causal effects.
Our findings advance the understanding of disease pathogenesis and highlight new possibilities for improving the diagnosis and management of cardiomyopathy.

8

A Systematic Exploration of LLM Behavior for EHR phenotyping

Yamga, E.; Murphy, S.; Despres, P.

2026-04-24 health informatics 10.64898/2026.04.16.26350890 medRxiv

Top 0.4%

1.3%

Background: Electronic health record (EHR) phenotyping underpins observational research, cohort discovery, and clinical trial screening. Large language models (LLMs) offer new capabilities for extracting phenotypes from unstructured text, but their performance depends on pipeline design choices, including prompting, text segmentation, and aggregation. No systematic framework has previously examined how these parameters shape accuracy and reproducibility. Methods: We evaluated LLM-based phenotyping pipelines using 1,388 discharge summaries across 16 clinical phenotypes. A full factorial experiment with LLaMA-3B, 8B, and 70B systematically varied three pipeline components: prompting (zero-shot, few-shot, chain-of-thought, extract-then-phenotype), chunking (none, naive, document-based), and aggregation (any-positive, two-vote, majority), yielding 24 configurations per model. To compare intrinsic model capabilities, biomedical domain-adapted, commercial frontier (LLaMA-405B, GPT-4o, Gemini Flash 2.0), and reasoning-optimized models (DeepSeek-R1) were evaluated under a fixed configuration. Performance was assessed using precision, recall, and macro-F1; secondary analyses examined prediction consistency (Shannon entropy), self-confidence calibration, and the development of a taxonomy of recurrent model errors. Results: Factorial ANOVAs showed that chunking and aggregation were the dominant drivers of performance, whereas the prompting strategy contributed minimally. Configuration effects were stable across model sizes, with no significant Model x Parameter interactions. Phenotype difficulty varied substantially (macro-F1 = 0.40-0.90), yet the highest-performing configuration, whole-document inference without aggregation, was consistent across phenotypes, as confirmed by mixed-effects modeling. In cross-model comparisons, DeepSeek-R1 achieved the highest macro-F1 (0.89), while LLaMA-70B matched GPT-4o and LLaMA-405B at substantially lower cost.
Prediction entropy was low overall and driven primarily by phenotype difficulty rather than prompting or temperature. Self-confidence calibration was only moderately informative: high-confidence predictions were more accurate, but larger models exhibited systematic overconfidence. Conclusions: LLM performance in EHR phenotyping is governed primarily by input structure and model capacity, not prompt engineering. Simple, document-level inference yields robust performance across diverse phenotypes, providing practical design guidance for LLM-based cohort identification while underscoring the continued need for human oversight for challenging phenotypes.
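
The chunk-level aggregation rules and the entropy-based consistency measure named in this abstract can be sketched as below. This is only an illustrative reading, not the authors' code: the thresholds assumed for "two-vote" (at least two positive chunks) and "majority" (more than half of chunks positive), and the function names `aggregate` and `shannon_entropy`, are assumptions.

```python
import math
from collections import Counter
from typing import List


def aggregate(chunk_votes: List[bool], rule: str) -> bool:
    """Combine per-chunk phenotype predictions into one document-level label."""
    positives = sum(chunk_votes)
    if rule == "any-positive":
        return positives >= 1
    if rule == "two-vote":  # assumed: at least two chunks must be positive
        return positives >= 2
    if rule == "majority":  # assumed: strictly more than half of the chunks
        return positives > len(chunk_votes) / 2
    raise ValueError(f"unknown aggregation rule: {rule}")


def shannon_entropy(predictions: List[bool]) -> float:
    """Entropy (in bits) of repeated predictions for the same input; 0 means fully consistent."""
    n = len(predictions)
    counts = Counter(predictions)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical per-chunk outputs for one discharge summary split into five chunks.
votes = [True, False, False, False, True]
for rule in ("any-positive", "two-vote", "majority"):
    print(rule, aggregate(votes, rule))
```

The same chunk outputs can yield different document-level labels depending on the rule, which is one way aggregation could drive performance more than prompt wording.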

9

The Golden Opportunity or the Cutting Room Floor? Quantifying and Characterizing the Loss and Addition of Social Determinants of Health during Clinician Editing of Ambient AI Documentation

Kim, S.; Guo, Y.; Sutari, S.; Chow, E.; Tam, S.; Perret, D.; Pandita, D.; Zheng, K.

2026-04-22 health systems and quality improvement 10.64898/2026.04.20.26351322 medRxiv

Top 0.5%

0.9%

Social determinants of health (SDoH) are important for clinical care, but it remains unclear how much AI-captured social context is preserved after clinician editing in ambient documentation workflows. We retrospectively analyzed 75,133 paired ambient AI-drafted and clinician-finalized note sections from ambulatory care at a large academic health system. Using a rule-based NLP pipeline, we extracted 21 SDoH categories and quantified retention, deletion, and addition. SDoH appeared in 25.2% of AI drafts versus 17.2% of final notes. At the mention level, AI captured 29,991 SDoH mentions, of which 45.1% were deleted and 54.9% were retained, with clinicians adding 3,583 new mentions. Insurance and marital status were most often deleted, whereas substance use and physical activity were more often retained. Deletion patterns also varied by specialty, supporting the need for specialty-aware ambient AI systems.

10

BRIDGE: a barrier-informed Bayesian Risk prediction model for risk IDentification, trajectory Grouping, and profiling of non-adherencE to cardioprotective medicines in primary care

Koh, H. J. W.; Trin, C.; Ademi, Z.; Zomer, E.; Berkovic, D.; Cataldo Miranda, P.; Gibson, B.; Bell, J. S.; Ilomaki, J.; Liew, D.; Reid, C.; Lybrand, S.; Gasevic, D.; Earnest, A.; Gasevic, D.; Talic, S.

2026-04-22 pharmacology and therapeutics 10.64898/2026.04.21.26351387 medRxiv

Top 0.5%

0.9%

Background: Non-adherence to lipid-lowering therapy (LLT) affects up to half of patients and contributes substantially to preventable cardiovascular morbidity and mortality. Existing measures, such as the proportion of days covered, provide cross-sectional summaries but fail to capture the dynamic patterns of adherence over time. Although group-based trajectory modelling identifies distinct longitudinal adherence patterns, no approach currently predicts trajectory membership prospectively while incorporating patient-reported barriers. We developed BRIDGE, a barrier-informed Bayesian model to predict adherence trajectories and identify their underlying drivers. Methods: BRIDGE incorporates patient-reported barriers as structured prior information within a Bayesian framework for adherence-trajectory prediction. The model was designed not only to estimate which patients are likely to follow different adherence trajectories, but also to generate clinically interpretable probability estimates that help explain why those trajectories may arise and what modifiable factors may be most relevant for intervention. Results: BRIDGE achieved a macro AUROC of 0.809 (95% CI 0.806 to 0.813), comparable to random forest (0.815 (95% CI 0.812 to 0.819)) and XGBoost (0.821 (95% CI 0.818 to 0.824)), two widely used machine-learning benchmarks for structured clinical prediction. Calibration was superior to random forest (Brier score 0.530 vs 0.545), and performance was stable across six independent training runs (AUROC SD = 0.003). Incorporating barrier-informed priors improved accuracy by 3.5% and calibration by 5.5% compared to flat priors, showing that incorporation of patient-reported barriers added value beyond electronic medical record data alone.
Four clinically distinct adherence trajectories were identified: gradual decline associated with treatment deprioritisation amid polypharmacy (10.4%), early discontinuation linked to asymptomatic risk dismissal (40.5%), rapid decline associated with intolerance (28.8%), and persistent adherence (20.2%). Counterfactual analysis identified trajectory-specific intervention levers. Conclusions: BRIDGE provides accurate and well-calibrated prediction of adherence trajectories while offering clinically actionable insights into their underlying drivers. By integrating patient-reported barriers with routine clinical data, the model supports targeted, mechanism-informed interventions at the point of prescribing to improve adherence to cardioprotective therapies. Funding: MRFF CVD Mission Grant 2017451. Evidence before this study: We searched PubMed and Scopus from database inception to December 2025 using the terms "medication adherence", "trajectory", "prediction model", "Bayesian", "lipid-lowering therapy", and "barriers", with no language restrictions. Group-based trajectory modelling has consistently identified three to five adherence patterns across cardiovascular cohorts; however, these applications have been descriptive rather than predictive. Machine-learning models for adherence prediction achieve moderate discrimination but treat adherence as a binary or continuous outcome, thereby overlooking the clinically meaningful heterogeneity captured by trajectory approaches. One prior study applied a Bayesian dynamic linear model to examine adherence-outcome associations, but it did not predict adherence trajectories or incorporate patient-reported barriers. To our knowledge, no published model integrates patient-reported barriers into trajectory prediction. Added value of this study: BRIDGE is, to our knowledge, the first model to incorporate patient-reported adherence barriers as hierarchical domain-informed priors within a Bayesian framework for trajectory prediction.
Using 108 predictors derived from routine electronic medical records, the model achieves discrimination comparable to state-of-the-art machine-learning approaches while additionally providing uncertainty quantification, barrier-level interpretability, and counterfactual insights to inform intervention strategies. The identified trajectories differed not only in adherence level but also in switching behaviour, drug-class evolution, and medication burden, suggesting distinct underlying mechanisms of non-adherence that may require tailored clinical responses. Implications of all the available evidence: Each adherence trajectory implies a distinct intervention target: asymptomatic risk communication for early discontinuers (40.5% of patients), proactive tolerability management for rapid decliners, medication simplification for patients with gradual decline associated with polypharmacy, and maintenance support for persistent adherers. By integrating routinely collected clinical data with patient-reported barriers, BRIDGE can be deployed within existing primary care EMR infrastructure to generate actionable, trajectory- and patient-specific recommendations at the point of prescribing, helping to bridge the gap between adherence measurement and targeted adherence management.
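
As background for the calibration figures quoted in this abstract, a multi-class Brier score can be sketched as below. This is a generic textbook form, summing squared errors over classes and averaging over patients; whether the authors used exactly this variant is an assumption, and the function name `multiclass_brier` and the toy inputs are hypothetical.

```python
from typing import List, Sequence


def multiclass_brier(probs: Sequence[Sequence[float]], labels: List[int]) -> float:
    """Mean over samples of the summed squared error between predicted
    class probabilities and the one-hot true class (lower is better)."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


# Hypothetical predictions over four trajectory classes for two patients.
confident_and_right = [[1.0, 0.0, 0.0, 0.0]]
uninformative = [[0.25, 0.25, 0.25, 0.25]]

print(multiclass_brier(confident_and_right, [0]))
print(multiclass_brier(uninformative, [2]))
```

Under this form the score ranges from 0 (perfect) to 2 (all probability mass on a wrong class), so values are not directly comparable with the 0-to-1 binary Brier score.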

11

Generalizing intensive care AI across time scales in resource-limited settings

Devadiga, A.; Singh, P.; Sankar, J.; Lodha, R.; Sethi, T.

2026-04-24 health informatics 10.64898/2026.04.23.26351588 medRxiv

Top 0.5%

0.8%

Temporal resolution of physiological monitoring in intensive care varies widely across healthcare systems. Artificial intelligence models assume a uniform and fixed frequency of sampling, thus limiting the generalizability of models, especially to resource-limited settings. Here, we propose a novel resolution-transfer task for physiological time series and ask whether models trained on high-resolution data can generalize to a low data-density setting without the need to retrain them. SafeICU, a novel longitudinal pediatric intensive care dataset spanning ten years from a tertiary care hospital in India, was used to test this hypothesis. Self-supervised transformer models were trained on 144,271 patient-hours of high-resolution physiological signals from 984 pediatric ICU stays to learn representations of heart rate, respiratory rate, oxygen saturation, and arterial blood pressure. Transfer of this model to low-resolution data established robust performance in clinically relevant lower-frequency intervals, consistently outperforming models trained directly at coarser resolutions. Further, these representations generalized across patient populations, maintaining performance when evaluated on adult intensive care cohorts from the MIMIC-III and eICU databases without retraining. In a downstream task of early shock prediction, models achieved strong discrimination in the pediatric cohort (area under the receiver operating characteristic curve (AUROC) 0.87; area under the precision-recall curve (AUPRC) 0.92) and retained stable performance across monitoring intervals from 10 to 60 minutes (AUROC 0.78-0.88). Together, these results demonstrate that physiological representations learned from high-resolution data enable time-scale-robust and transferable AI for intensive care. 
The publicly released SafeICU dataset, comprising longitudinal vital signs, laboratory measurements, treatment records, microbiology, and admission and discharge, provides a foundation for developing and deploying generalizable clinical AI in resource-limited settings.

12

Antecedent autonomic symptoms predict contemporary autonomic symptom burden and reduced health-related quality of life after spontaneous coronary artery dissection

Seeley, M.-C.; Tran, D. X. A.; Marathe, J. A.; Sharma, S.; Wilson, G.; Atkins, S.; Lau, D. H.; Gallagher, C.; Psaltis, P. J.

2026-04-23 cardiovascular medicine 10.64898/2026.04.21.26351434 medRxiv

Top 0.6%

0.8%

Introduction: Spontaneous coronary artery dissection (SCAD) is frequently accompanied by persistent symptoms of unknown pathogenesis after the index event. Autonomic dysfunction is a plausible mechanism for these but has not been systematically characterized. We quantified antecedent and contemporary autonomic symptoms in survivors of SCAD and examined their associations with cardiac and extra-cardiac symptoms and health-related quality of life. Methods: This cross-sectional study recruited 227 volunteers from multiple countries with a self-reported history of SCAD. Participants completed validated patient-reported measures, including the Composite Autonomic Symptom Score-31 (COMPASS-31), Anxiety Sensitivity Index-3 (ASI-3), and EuroQol-5 Dimension-5L (EQ-5D-5L). They also completed an internally derived retrospective autonomic predisposition score assessing symptoms during adolescence and early adulthood. Results: Participants were predominantly female (97.8%), median age 53 (47-58) years, and were surveyed a median of 3 (1-5) years after their index SCAD event. 21.6% reported SCAD recurrence. Moderate autonomic symptom burden (COMPASS-31 20) was present in 56.4% and severe burden (40) in 16.3%. History of antecedent autonomic symptoms was the strongest independent predictor of contemporary autonomic symptom burden after adjustment for demographic and clinical covariates (=0.514; P <0.001). Greater autonomic symptom burden independently predicted lower EQ-5D health utility (=0.150; P=0.029) and was associated with the ASI-3 physical concerns (=0.232; P <0.001), but not social concerns domain. Autonomic symptoms were not associated with SCAD recurrence. Conclusion: Symptoms of autonomic dysregulation are common in survivors of SCAD and are associated with reduced quality of life. Their association with antecedent dysautonomic features during adolescence and early adulthood suggests a longstanding predisposition, the significance of which warrants further evaluation.

13

Improving Care by FAster risk-STratification through use of high sensitivity point-of-care troponin in patients presenting with possible acute coronary syndrome in the EmeRgency department (ICare-FASTER): a stepped-wedge cluster randomized trial

Than, M.; Pickering, J. W.; Joyce, L. R.; Buchan, V. A.; Florkowski, C. M.; Mills, N. L.; Hamill, L.; Prystowsky, J.; Harger, S.; Reed, M.; Bayless, J.; Feberwee, A.; Attenburrow, T.; Norman, T.; Welfare, O.; Heiden, T.; Kavsak, P.; Jaffe, A. S.; Apple, F.; Peacock, W. F.; Cullen, L.; Aldous, S.; Richards, A. M.; Lacey, C.; Troughton, R.; Frampton, C.; Body, R.; Mueller, C.; Lord, S. J.; George, P. M.; Devlin, G.

2026-04-23 cardiovascular medicine 10.64898/2026.04.21.26351433 medRxiv

Top 0.6%

0.7%

BACKGROUND: Point-of-care (POC) high-sensitivity cardiac troponin (hs-cTn) testing has the potential to expedite decision-making and reduce emergency department (ED) length of stay for patients presenting with possible myocardial infarction (MI) by ensuring that results are consistently available when looked for by clinicians. We assessed the real-life effectiveness and safety of implementing POC hs-cTn testing in the ED. METHODS: We conducted a pragmatic, stepped-wedge cluster randomized trial. The control arm was usual care with an accelerated diagnostic pathway utilizing a single-sample rule-out step with a central laboratory hs-cTn assay. The intervention arm used the same pathway with a POC hs-cTnI assay. The primary effectiveness outcome was ED length of stay assessed using a generalized linear mixed model, and the safety outcome was 30-day MI or cardiac death. RESULTS: Six sites participated with 59,980 ED presentations (44,747 individuals, 61 ± 19 years, 49.5% female) from February 2023 to January 2025, of which 31,392 presentations occurred during the intervention arm. After adjustment for covariates associated with length of stay, the intervention reduced length of stay by 13% (95% confidence interval [CI], 9 to 16%; P<0.001), corresponding to a reduction of 47 minutes (95% CI, 33 to 61 minutes) from a mean length of stay in the control arm of 376 minutes. The 30-day MI or cardiac death rate was similar in the control and intervention arms (0.39% and 0.39%, respectively; P=0.54). CONCLUSIONS: Implementation of whole-blood hs-cTnI testing at the POC into an accelerated diagnostic pathway was safe and reduced length of stay in the ED compared with laboratory testing.

14

MIMIC-IV-Phenotype-Atlas (MIPA) : A Publicly Available Dataset for EHR Phenotyping

Yamga, E.; Goudrar, R.; Despres, P.

2026-04-24 health informatics 10.64898/2026.04.16.26350888 medRxiv

Top 0.7%

0.6%

Introduction Secondary use of electronic health records (EHRs) often requires transforming raw clinical information into research-grade data. A central step in this process is EHR phenotyping - the identification of patient cohorts defined by specific medical conditions. Although numerous approaches exist, from ICD-based heuristics to supervised learning and large language models (LLMs), the field lacks standardized benchmark datasets, limiting reproducibility and hindering fair comparison across methods. Methods We developed the MIMIC-IV Phenotype Atlas (MIPA) dataset, an adaptation of MIMIC-IV that provides expert-annotated discharge summaries across 16 phenotypes of varying prevalence and complexity. Two independent clinicians reviewed and labeled the discharge summaries, resolving disagreements by consensus. In parallel, we implemented a processing pipeline that extracts multimodal EHR features and generates training, validation, and testing datasets for supervised phenotyping. To illustrate MIPA's utility, we benchmarked four phenotyping methods on the task: ICD-based classifiers, keyword-driven Term Frequency-Inverse Document Frequency (TF-IDF) classifiers, supervised machine learning (ML) models, and LLMs. Results The final MIPA corpus consists of 1,388 expert-annotated discharge summaries. Annotation reliability was high (mean document-level kappa = 0.805, mean label-level kappa = 0.771), with 91% of disagreements resolved through consensus review. MIPA provides high-quality phenotype labels paired with structured EHR features and predefined train/validation/test splits for each phenotype. In the benchmarking case study, LLMs achieved the highest F1 scores in 13 of 16 phenotypes, particularly for conditions requiring contextual interpretation of clinical narrative, while supervised ML offered moderate improvements over rule-based baselines.
Conclusion MIPA is the first publicly available benchmark dataset dedicated to EHR phenotyping, combining expert-curated annotations, broad phenotype coverage, and a reproducible processing pipeline. By enabling standardized comparison across ICD-based heuristics, ML models, and LLMs, MIPA provides a durable reference resource to advance methodological development in automated phenotyping.
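
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a kappa value for two annotators can be computed. It assumes the reported kappa is Cohen's kappa, which the abstract does not state explicitly, and the label vectors are made-up toy data, not MIPA annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[k] * counts_b[k] for k in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical binary phenotype labels for ten discharge summaries.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the simple proportion of matching labels.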

15

Echocardiographic characterization and markers of cardiovascular risk in adults with sickle cell disease in a Colombian tertiary referral centre: a cross-sectional study

Arrieta-Mendoza, M. E.; Barbosa-Balaguera, S.; Betancourt, J. R.; Ayala-Zapata, S.; Messu-Llanos, C. D.; Rosales-Melo, J. P.; Andrade-Hoyos, D. F.; Herrera-Escandon, A.; Aguilar-Molina, O. E.

2026-04-20 cardiovascular medicine 10.64898/2026.04.16.26351071 medRxiv

Top 0.7%

0.6%

Sickle cell disease (SCD) is associated with substantial cardiovascular morbidity, but echocardiographic data from Latin American populations remain scarce. We aimed to characterise the structural, functional, and haemodynamic echocardiographic profile of adults with SCD attending a tertiary referral centre in Cali, Colombia. We conducted an observational, cross-sectional study based on systematic review of medical records and transthoracic echocardiography reports of consecutive adult patients (≥18 years) with confirmed SCD evaluated between January 2022 and December 2024. Patients with complex congenital heart disease, severe valvular disease of unrelated aetiology, pregnancy, or echocardiograms of insufficient quality were excluded. Of 669 patients screened, 57 met inclusion criteria. Reporting followed STROBE recommendations. The median age was 24 years (interquartile range [IQR] 21-32) and 59.6% were female; the SS genotype was the most frequent (76.4%) and 71.4% were on hydroxyurea. Median haemoglobin was 10.2 g/dL (IQR 9.3-11.4) and median NT-proBNP 491 pg/mL (IQR 98-1290). Most patients had preserved left ventricular dimensions and systolic function (median ejection fraction 63%, IQR 57-66.5; mean global longitudinal strain -18.9% ± 2.9). Right ventricular function was preserved (mean tricuspid annular plane systolic excursion 25.4 ± 4.6 mm). Left ventricular geometry was normal in 42.1%, with concentric remodelling in 24.6%, concentric hypertrophy in 21.1%, and eccentric hypertrophy in 12.3%. Diastolic function was normal in 71.4%. Valvular disease, when present, was predominantly mild. Tricuspid regurgitation velocity exceeded 2.5 m/s in 29.8% of patients and exceeded 3.0 m/s in 10.5%, identifying a substantial subgroup at intermediate-to-high probability of pulmonary hypertension.
In this Colombian cohort of relatively young adults with SCD, cardiac structure and biventricular function were largely preserved, but nearly one-third of patients had echocardiographic findings suggestive of pulmonary hypertension. These findings support the routine use of transthoracic echocardiography as an accessible tool for early cardiovascular risk stratification in adults with SCD in low- and middle-income settings.

16

Biventricular cardiac dynamic shape: genetics and cardiometabolic disease associations

Burns, R.; Young, W. J.; Uddin, K.; Petersen, S. E.; Ramirez, J.; Young, A. A.; Munroe, P. B.

2026-04-20 genetic and genomic medicine 10.64898/2026.04.19.26350940 medRxiv

Top 0.7%

0.5%

Background: Genetic studies using cardiac magnetic resonance (CMR) imaging have identified loci related to cardiac shape, but most focus on static morphology. The value of a dynamic cardiac shape atlas capturing both shape and function remains unknown. Methods: A dynamic shape atlas comprising CMR-derived shape models at end-diastole and end-systole was combined with genetic and outcome data in 36,992 UK Biobank participants. Dynamic shape principal components (PCs) describing >1% of variance were characterized, and tested for associations with prevalent and incident cardiometabolic diseases, including ischemic heart disease (IHD), heart failure (HF), significant atrioventricular block (AVB), and atrial fibrillation (AF), and independent predictive power alongside standard CMR measures. Genome-wide association studies (GWAS) were performed to identify candidate genes and biological pathways, and polygenic risk scores (PRS) were assessed for disease associations. Mendelian randomization (MR) was performed to test causality of observed disease associations. Results: We identified 14 dynamic cardiac shape PCs capturing 83.3% of total dynamic cardiac shape variance. These PCs captured distinct functional remodeling patterns such as variation in annular plane systolic excursion, while remaining only modestly correlated with standard CMR measures. All 14 PCs were associated with at least one incident cardiometabolic disease, with the strongest associations observed for incident IHD, HF, and AVB. Notably, incorporating dynamic shape PCs improved the prediction of incident IHD beyond standard CMR measures. GWAS identified 75 genetic loci associated with dynamic shape, including 14 variants previously unreported for cardiac traits, and candidate genes demonstrated enrichment in pathways related to cardiac development and contractile function. PRS derived from dynamic shape loci were significantly associated with multiple outcomes, most prominently HF.
MR identified significant causal relationships between several PCs and cardiometabolic disease. Conclusions: Dynamic cardiac shape features capture aspects of cardiac structure and function not fully represented by standard CMR measures. These features are strongly associated with incident cardiometabolic disease and provide new insights into the genetic architecture of cardiac remodeling. Clinical perspective: What is new? Genetic and outcome relationships with a dynamic statistical shape model capturing both left and right ventricles at end-diastole and end-systole. Demonstration of incremental value over existing cardiac shape models, through capture of functional remodeling not represented by standard imaging measures. Identification of genetic susceptibility loci for dynamic cardiac shape, including 14 variants not previously reported for cardiac traits. What are the clinical implications? The results enhance our understanding of the genetic architecture of dynamic cardiac shape and function in the general population and clarify their relationships with other cardiovascular endophenotypes and incident cardiometabolic diseases. Newly identified candidate genes expand the biological pathways implicated in cardiac remodeling and provide targets for future functional and mechanistic studies. The improved prediction of incident cardiometabolic disease, particularly ischemic heart disease, achieved by adding dynamic shape PCs to traditional CMR measures suggests potential value for their inclusion in evaluation of patients.

17

Interleukin-1 Receptor Antagonist Levels In Patients With Heart Failure And Reduced Ejection Fraction Treated With Anakinra

Kelly, J.; Mezzaroma, E.; Roscioni, A.; McSkimming, C.; Mauro, A.; Narayan, P.; Golino, M.; Trankle, C.; Canada, J. M.; Toldo, S.; Van Tassell, B. W.; Abbate, A.

2026-04-25 cardiovascular medicine 10.64898/2026.04.17.26351024 medRxiv

Top 0.7%

0.5%

Background. Patients with heart failure and reduced ejection fraction (HFrEF) commonly show signs of systemic inflammation. Interleukin-1 (IL-1) is a pro-inflammatory cytokine known to modulate cardiac function. We aimed to determine the effects of treatment with anakinra, recombinant IL-1 receptor antagonist (IL-1Ra), on plasma IL-1Ra levels. Methods. We measured IL-1Ra levels at baseline and at the longest available follow-up, up to 24 weeks, in 63 patients (44 males, 40 self-identified Black-Americans) with recent hospitalization for HFrEF and systemic inflammation (C-reactive protein [CRP] levels >2 mg/L) who were assigned to anakinra (N=42 [66.7%]) or placebo (N=21 [33.3%]) as part of the REDHART2 clinical trial (NCT0014686). Cardiorespiratory fitness was measured as peak oxygen consumption (peak VO2). Results. Baseline plasma IL-1Ra levels were 380 pg/ml (290 to 1046). On-treatment IL-1Ra levels were significantly higher in the patients treated with anakinra vs placebo (3,994 pg/ml [3,372 to 5,000] vs 492 pg/ml [304 to 1370], P<0.001). The longest available follow-up was 6 weeks in 10 patients (15.9%), 12 weeks in 12 patients (19%) and 24 weeks in 41 patients (65.1%). On-treatment IL-1Ra levels and interval change in IL-1Ra showed a modest inverse correlation with on-treatment CRP levels (R=-0.269, P=0.033 and R=-0.355, P=0.004, respectively) and no statistically significant correlations with peak VO2 values (P>0.05). Conclusions. Patients with recently decompensated HFrEF and systemic inflammation treated with recombinant IL-1Ra, anakinra, have a significant several-fold increase in plasma IL-1Ra levels. On-treatment IL-1Ra levels, however, show only a modest correlation with CRP levels and none with peak VO2.

18

MedSAM2-CXR: A Box-Latent Framework for Chest X-ray Classification and Report Generation

Hakata, Y.; Oikawa, M.; Fujisawa, S.

2026-04-22 health informatics 10.64898/2026.04.20.26351338 medRxiv

Top 0.8%

0.4%

Who is affected: In Japan, approximately 100 million chest radiographs (CXRs) are acquired annually, while only about 7,000 board-certified diagnostic radiologists practice nationwide (Japan Radiological Society workforce statistics; OECD Health Statistics, most recent available year). This implies an average workload exceeding 10,000 imaging studies per radiologist per year if all CXRs were attributed to board-certified diagnostic radiologists (an upper-bound estimate, because in practice many CXRs are primarily read by non-radiologist physicians). In settings such as night shifts, weekends, remote islands, and regional care networks, non-radiologist physicians frequently act as primary readers. Despite strong demand for AI assistance, existing systems are typically limited by one of three shortcomings -- poor cross-institutional generalization, limited interpretability, or inability to generate draft reports -- and consequently see limited clinical deployment. What we built: We propose a Box-Latent Trinity that embeds each image as a hyperrectangle parameterized by a center c and a radius r, rather than as a single point in a latent space. We further introduce BL-TTA (Box-Latent Test-Time Augmentation), which approximately closes the train-inference gap (exact in the N → ∞ limit; N = 8 suffices in practice) by averaging predictions over samples drawn from within the latent box at inference time. Both components are implemented on top of the frozen MedSAM2 medical imaging foundation model. A single box representation simultaneously supports three functions: (A) theoretically grounded source selection, (B) device-invariant augmentation, and (C) case-based retrieval-augmented generation (RAG). Each prediction is accompanied by retrieved similar prior cases, a calibrated confidence estimate, and clinical-guideline references.
How well it performs: On the Open-i CXR corpus (2,954 image-report pairs) under a patient-level 80/10/10 split and 5-seed reproducibility, the full system B5 achieves macro area under the receiver-operating-characteristic curve (macro-AUROC) 0.639 (best-seed test; 5-seed mean 0.626, Table 2; absolute +0.015 over the strongest same-backbone baseline, Merlin-style 0.624), elementwise accuracy 0.753 (absolute +0.072 over Merlin-style 0.681 -- equivalent to approximately 7 fewer label-level errors per 100 (label, image) predictions across 14 finding labels, not per 100 images), and report label-F1 0.435 (absolute +0.086, relative +25% over the strongest same-backbone report-generation baseline, Bootstrapping-style 0.349). Under simulated pixel-space device-shift intensities up to twice the training distribution, AUROC degrades by only 0.014. Brier score (macro) is 0.061; Cohen's κ between two independent rule-based label extractors is 0.702 (substantial agreement); the box radius yields an out-of-distribution (OOD) detection AUROC of 0.595; and the framework provides four structural explainable-AI (XAI) outputs -- retrieved similar cases, confidence tier, per-axis uncertainty, and visual saliency -- which we jointly quantify in a single CXR study, a combination that, to our knowledge, has not been reported previously. Path to deployment: Because the complete experiment can be reproduced in under two hours on a consumer-grade GPU (NVIDIA RTX 4060, 8 GB VRAM), the framework can run on compute resources already available at typical healthcare institutions.
The approach thus supports the practical delivery of evidence-grounded diagnostic support to night shifts, remote-island care, and secondary readings in health checkups -- settings in which a board-certified radiologist is not locally available. One-sentence summary: Reproducible end-to-end in under two hours on a single consumer-grade GPU, the proposed framework outperforms the strongest same-backbone medical-AI baselines on three principal metrics, maintains accuracy under simulated device shifts, and automatically drafts evidence-grounded radiology reports, offering a reproducible and compute-efficient direction toward reducing the reading burden of Japanese radiologists, subject to external validation.
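
The test-time averaging step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only says that predictions are averaged over samples drawn from within the latent box, so the uniform per-axis sampling, the logistic `classify` head, and all names and numbers below are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def classify(latent, weights, bias):
    """Hypothetical stand-in head: one logistic unit per finding label."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, latent)) + b)
        for row, b in zip(weights, bias)
    ]


def bl_tta_predict(center, radius, weights, bias, n_samples=8, seed=None):
    """Average predictions over points sampled inside the box [c - r, c + r]."""
    rng = random.Random(seed)
    totals = [0.0] * len(bias)
    for _ in range(n_samples):
        # Assumption: sample each latent axis uniformly within its radius.
        sample = [c + rng.uniform(-r, r) for c, r in zip(center, radius)]
        for i, p in enumerate(classify(sample, weights, bias)):
            totals[i] += p
    return [t / n_samples for t in totals]


# Toy 3-dimensional latent box and a 2-label head (made-up numbers).
center = [0.2, -0.5, 1.0]
radius = [0.1, 0.3, 0.05]
weights = [[1.0, -2.0, 0.5], [0.3, 0.8, -1.2]]
bias = [0.0, 0.1]
probs = bl_tta_predict(center, radius, weights, bias, n_samples=8, seed=0)
print(probs)
```

With a zero radius the box collapses to a point and the average reduces to the ordinary single-point prediction, which is a convenient sanity check for any implementation of this idea.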

19

Patient perspectives on living with hypertension: Social media listening analysis across predominantly high-income countries

Di Somma, S.; Gervais, R.; Bains, M.; Carter-Williams, S.; Messner, S.; Onsongo, N.

2026-04-23 cardiovascular medicine 10.64898/2026.04.22.26351483 medRxiv

Top 0.8%

0.4%

Background: Chronic conditions such as hypertension can significantly disrupt daily life and emotional wellbeing. The interaction between patients' perceptions, adherence to antihypertensive medication and quality of life (QoL) remains underexplored outside structured clinical settings. Objectives: To capture unprompted patient perspectives, assess whether hypertension affects QoL, and investigate whether patient-reported experiences are associated with self-reported antihypertensive medication adherence. Methods: Social media listening (SML) study analyzing 86,368 anonymized posts from individuals with hypertension in 12 countries, collected between January 2022 and May 2024. Posts from 11 countries (n=81,368) were analyzed using artificial intelligence-enabled natural language processing. Posts from China (n=5,000) were analyzed separately using a harmonized framework. Quantitative and qualitative methods assessed variations by country, age, and gender, and associations between emotional expression and antihypertensive medication adherence. Results: Across the 11-country core sample, 45% of posts mentioned at least one QoL impact, most commonly worry/anxiety (11%). Impacts varied across countries. Among 8,096 posts with age identified, individuals <40 years reported emotional balance impacts in 28% of posts versus 22% among those aged 40+. Work/Education impacts were mentioned in 17% of posts by those <40 years vs 12% in 40+. Among 7,968 posts explicitly referencing adherence, expressed worry was associated with stricter adherence (62% association score), as were structured routines (79% score), home monitoring (77%), dietary changes (77%), and exercise (71%). In contrast, sadness/depression was associated with inconsistent adherence (71%), as were forgetfulness (79%), side effects (73%), and cost/insurance concerns (65%).
Conclusions: These results emphasize the importance of the psychological and emotional impact of hypertension, including on adherence to medication regimens, reinforcing the value of a holistic approach to patient care.

20

Research Paper on AuditMed: A Single-File, Browser-Based Clinical Evidence Audit Platform Architecture, Current Capabilities, and Proposed Applications in Drug Informatics and Pharmacy Education

Ferguson, D. J.

2026-04-20 health informatics 10.64898/2026.04.19.26351188 medRxiv

Top 0.9%

0.3%

Background: Clinical pharmacists, trainees, and educators rely on multi-database literature retrieval and structured evidence synthesis to answer drug-information questions. Existing workflows require navigation across PubMed, DailyMed, LactMed, interaction checkers, and specialty guideline repositories with manual de-duplication, appraisal, and synthesis. Commercial platforms that integrate these functions are costly and often unavailable in community, rural, and international training contexts. Objective: This report describes the architecture of AuditMed, a single-file, browser-based clinical evidence audit platform, and reports preliminary stress-test results against a complex multi-morbidity case corpus. AuditMed is intended for research and educational use and is not a substitute for clinical judgment or validated commercial clinical decision-support systems. Methods: AuditMed integrates nineteen free, publicly available clinical and biomedical application programming interfaces into a six-stage Search → Select → Parse → Analyze → Infer → Create pipeline and supports browser-local patient-case ingestion with regex-based HIPAA Safe Harbor de-identification. Preliminary stress-testing was conducted against eleven cases (Cases 30 through 40) from the Complex Clinical Case Compendium Software Validation Suite, each featuring over twenty concurrent active disease states. For each case, the one-click inference pipeline was executed with default settings and the full Clinical Inference Report was captured verbatim. No retrieval-sensitivity, synthesis-fidelity, or time-to-answer endpoints were pre-specified; the exercise was qualitative and oriented toward pipeline behavior under extreme multi-morbidity. Results: The pipeline completed without fatal errors for all eleven cases and produced a structured Clinical Inference Report in each instance. Quantitative-finding detection performed as designed for hematologic parameters and cardiac biomarkers.
Two parser defects were identified and are reproduced in the appendix: an age-as-fever regex-precedence defect affecting seven cases and a diagnosis-versus-medication parsing defect affecting one case. Evidence-linkage rate varied from zero evidence-linked statements in seven cases to eleven in one case, reflecting dependence of the inference layer on MeSH-indexed literature coverage of the specific case diagnoses. Conclusions: AuditMed is an early-stage, open-source platform whose value at this stage is in providing a free, transparent, auditable workflow for multi-source evidence synthesis with explicit uncertainty flagging. The preliminary results document both robust end-to-end completion under extreme case complexity and specific, reproducible parser defects that will be addressed before formal evaluation. Planned evaluation studies are described.